6 research outputs found

E3: A Framework for Compiling C++ Programs with Encrypted Operands

Author: Eduardo Chielle
Homer Gamil
Michail Maniatakos
Nektarios Georgios Tsoutsos
Oleg Mazonka
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 28/05/2021

In this technical report we describe E3 (Encrypt-Everything-Everywhere), a framework which enables execution of standard C++ code with homomorphically encrypted variables. The framework automatically generates protected types so the programmer can remain oblivious to the underlying encryption scheme. C++ protected classes redefine operators according to the encryption scheme, effectively making the introduction of a new API unnecessary. In its current version, E3 supports a variety of homomorphic encryption libraries, batching, mixing different encryption schemes in the same program, as well as the ability to combine modular computation and bit-level computation.
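The idea of protected types that redefine operators can be pictured with a small sketch. It is written in Python rather than C++ purely for brevity, and it is not E3's code: `MockBackend`, `Protected`, `from_plain`, `reveal`, and `business_logic` are all hypothetical names, and the backend performs no encryption at all. It only shows how operator overloading lets ordinary arithmetic code run unchanged on a wrapped type.

```python
class MockBackend:
    """Hypothetical stand-in for an FHE library. It does NOT encrypt anything."""

    def __init__(self, modulus):
        self.modulus = modulus

    def encrypt(self, value):
        return {"payload": value % self.modulus}

    def decrypt(self, ciphertext):
        return ciphertext["payload"]

    def add(self, x, y):
        return {"payload": (x["payload"] + y["payload"]) % self.modulus}

    def mul(self, x, y):
        return {"payload": (x["payload"] * y["payload"]) % self.modulus}


class Protected:
    """Hypothetical protected integer: every operator forwards to the backend."""

    def __init__(self, backend, ciphertext):
        self.backend = backend
        self.ciphertext = ciphertext

    @classmethod
    def from_plain(cls, backend, value):
        return cls(backend, backend.encrypt(value))

    def _lift(self, other):
        # Plain integers are wrapped on the fly so mixed expressions work.
        if isinstance(other, Protected):
            return other
        return Protected.from_plain(self.backend, other)

    def __add__(self, other):
        rhs = self._lift(other)
        return Protected(self.backend, self.backend.add(self.ciphertext, rhs.ciphertext))

    def __mul__(self, other):
        rhs = self._lift(other)
        return Protected(self.backend, self.backend.mul(self.ciphertext, rhs.ciphertext))

    # Addition and multiplication are commutative here, so reuse the same methods.
    __radd__ = __add__
    __rmul__ = __mul__

    def reveal(self):
        return self.backend.decrypt(self.ciphertext)


def business_logic(x, y):
    # Ordinary arithmetic: no mention of encryption or of any backend API.
    return x * y + x + 1
```

Calling `business_logic` with two `Protected` values and then `reveal()` gives the same number as calling it with plain integers and reducing modulo the backend's modulus. Swapping the backend object would change the underlying scheme without touching `business_logic`.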

Cryptology ePrint Archive

CoFHEE: A Co-processor for Fully Homomorphic Encryption Execution

Author: Ashraf Mohammed
Chielle Eduardo
Gamil Homer
Gebremichael Mizan Abraha
Karri Ramesh
Maniatakos Michail
Nabeel Mohammed
Sanduleanu Mihai
Soni Deepraj
Publication date: 19/04/2022

The migration of computation to the cloud has raised privacy concerns, as sensitive data becomes vulnerable to attacks because it must be decrypted for processing. Fully Homomorphic Encryption (FHE) mitigates this issue as it enables meaningful computations to be performed directly on encrypted data. Nevertheless, FHE is orders of magnitude slower than unencrypted computation, which hinders its practicality and adoption. Therefore, improving FHE performance is essential for its real-world deployment. In this paper, we present a year-long effort to design, implement, fabricate, and post-silicon validate a hardware accelerator for Fully Homomorphic Encryption dubbed CoFHEE. With a design area of 12 mm^2, CoFHEE aims to improve performance of ciphertext multiplications, the most demanding arithmetic FHE operation, by accelerating several primitive operations on polynomials, such as polynomial additions and subtractions, Hadamard product, and Number Theoretic Transform. CoFHEE supports polynomial degrees of up to $n = 2^{14}$ with a maximum coefficient size of 128 bits, while it is capable of performing ciphertext multiplications entirely on chip for $n \leq 2^{13}$. CoFHEE is fabricated in 55nm CMOS technology and achieves 250 MHz with our custom-built low-power digital PLL design. In addition, our chip includes two communication interfaces to the host machine: UART and SPI. This manuscript presents all steps and design techniques in the ASIC development process, ranging from RTL design to fabrication and validation. We evaluate our chip with performance and power experiments and compare it against state-of-the-art software implementations and other ASIC designs. Developed RTL files are available in an open-source repository.
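The primitives named in the abstract (Number Theoretic Transform, Hadamard product) combine into polynomial multiplication, which can be sketched in a few lines of Python. This is a toy illustration, not CoFHEE's design: the modulus `Q = 17`, size `N = 4`, and root `PSI = 9` are assumptions chosen so the numbers stay small, the ring $\mathbb{Z}_Q[x]/(x^N + 1)$ is assumed because it is the usual choice in RLWE-based schemes, and the transform is evaluated directly in $O(N^2)$ steps rather than with the fast butterfly network a hardware design would use.

```python
Q = 17                 # toy prime modulus (assumption; real moduli are far larger)
N = 4                  # toy polynomial size, a power of two
PSI = 9                # primitive 2N-th root of unity mod Q: PSI**N % Q == Q - 1
OMEGA = PSI * PSI % Q  # primitive N-th root of unity mod Q


def ntt(a, root):
    """Direct O(N^2) number theoretic transform of `a` using `root`."""
    return [sum(a[j] * pow(root, i * j, Q) for j in range(N)) % Q for i in range(N)]


def hadamard(a, b):
    """Coefficient-wise (Hadamard) product mod Q."""
    return [(x * y) % Q for x, y in zip(a, b)]


def negacyclic_mul(a, b):
    """Multiply a and b modulo (x^N + 1, Q) via NTT, Hadamard product, inverse NTT."""
    # Pre-twist by powers of PSI so a cyclic convolution yields the negacyclic one.
    a_t = [(x * pow(PSI, i, Q)) % Q for i, x in enumerate(a)]
    b_t = [(x * pow(PSI, i, Q)) % Q for i, x in enumerate(b)]
    c_hat = hadamard(ntt(a_t, OMEGA), ntt(b_t, OMEGA))
    # Inverse transform: use the inverse root and scale by N^-1 mod Q.
    inv_n = pow(N, -1, Q)
    c_t = [(x * inv_n) % Q for x in ntt(c_hat, pow(OMEGA, -1, Q))]
    # Undo the twist.
    inv_psi = pow(PSI, -1, Q)
    return [(x * pow(inv_psi, i, Q)) % Q for i, x in enumerate(c_t)]


def schoolbook_negacyclic(a, b):
    """Reference O(N^2) multiplication modulo (x^N + 1, Q) for cross-checking."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                c[k] = (c[k] + a[i] * b[j]) % Q
            else:
                c[k - N] = (c[k - N] - a[i] * b[j]) % Q  # x^N wraps to -1
    return c
```

For any two coefficient lists of length `N`, `negacyclic_mul` and `schoolbook_negacyclic` return the same result; for instance, multiplying $x$ by $x^3$ gives $-1$, that is `[16, 0, 0, 0]` modulo 17. Modular inverses via `pow(x, -1, Q)` need Python 3.8 or later.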

arXiv.org e-Print Archive

TREBUCHET: Fully Homomorphic Encryption Accelerator for Deep Computation

Author: Badawi Ahmad Al
Canida Kellie
Cousins David Bruce
French Matthew
Gamil Homer
Jacob Ajey
Jaiswal Akhilesh
Maniatakos Michail
Mathew Clynn
Neda Negar
Polyakov Yuriy
Reagen Brandon
Reynwar Benedict
Schmidt Andrew
Soni Deepraj
Publication date: 11/04/2023

Secure computation is of critical importance not only to the DoD, but also across financial institutions, healthcare, and anywhere personally identifiable information (PII) is accessed. Traditional security techniques require data to be decrypted before performing any computation. When processed on untrusted systems, the decrypted data is vulnerable to attacks to extract the sensitive information. To address these vulnerabilities, Fully Homomorphic Encryption (FHE) keeps the data encrypted during computation and secures the results, even in these untrusted environments. However, FHE requires a significant amount of computation to perform equivalent unencrypted operations. To be useful, FHE must significantly close the computation gap (within 10x) to make encrypted processing practical. To accomplish this ambitious goal, the TREBUCHET project is leading research and development in FHE processing hardware to accelerate deep computations on encrypted data, as part of the DARPA MTO Data Privacy for Virtual Environments (DPRIVE) program. We accelerate the major secure standardized FHE schemes (BGV, BFV, CKKS, FHEW, etc.) at >=128-bit security while integrating with the open-source PALISADE and OpenFHE libraries currently used in the DoD and in industry. We utilize a novel tile-based chip design with highly parallel ALUs optimized for vectorized 128b modulo arithmetic. The TREBUCHET coprocessor design provides a highly modular, flexible, and extensible FHE accelerator for easy reconfiguration, deployment, integration and application on other hardware form factors, such as System-on-Chip or alternate chip areas. Comment: 6 pages, 5 figures, 2 tables

arXiv.org e-Print Archive

RPU: The Ring Processing Unit

Author: Ahmad Al Badawi
Andrew Schmidt
Benedict Reynwar
Benjamin Heyman
Brandon Reagen
David Bruce Cousins
Deepraj Soni
Franz Franchetti
Homer Gamil
Kellie Canida
Massoud Pedram
Matthew French
Michail Maniatakos
Mohammed Nabeel Thari Moopan
Naifeng Zhang
Negar Neda
Yuriy Polyakov
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 30/03/2023

Ring-Learning-with-Errors (RLWE) has emerged as the foundation of many important techniques for improving security and privacy, including homomorphic encryption and post-quantum cryptography. While promising, these techniques have received limited use due to the extreme overheads of running them on general-purpose machines. In this paper, we present a novel vector Instruction Set Architecture (ISA) and microarchitecture for accelerating the ring-based computations of RLWE. The ISA, named B512, is developed to meet the needs of ring processing workloads while balancing high performance and general-purpose programming support. Having an ISA rather than fixed hardware facilitates continued software improvement post-fabrication and the ability to support evolving workloads. We then propose the ring processing unit (RPU), a high-performance, modular implementation of B512. The RPU has native large-word modular arithmetic support, capabilities for very wide parallel processing, and a large-capacity, high-bandwidth scratchpad to meet the needs of ring processing. We address the challenges of programming the RPU using a newly developed SPIRAL backend. A configurable simulator is built to characterize design tradeoffs and quantify performance. The best-performing design was implemented in RTL and used to validate simulator performance. In addition to our characterization, we show that an RPU using 20.5 mm^2 of GF 12nm can provide a speedup of 1485x over a CPU running a 64k, 128-bit NTT, a core RLWE workload

Cryptology ePrint Archive

TREBUCHET: Fully Homomorphic Encryption Accelerator for Deep Computation

Author: Ahmad Al Badawi
Ajey Jacob
Akhilesh Jaiswal
Andrew Schmidt
Benedict Reynwar
Bo Zhang
Brandon Reagen
Clynn Mathew
David Bruce Cousins
Deepraj Soni
Franz Franchetti
Homer Gamil
Jeremy Johnson
Kellie Canida
Massoud Pedram
Matthew French
Michail Maniatakos
Mike Franusich
Naifeng Zhang
Negar Neda
Patrick Brinich
Patrick Broderick
Yuriy Polyakov
Zeming Cheng
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 18/04/2023

Cryptology ePrint Archive

Accelerating Fully Homomorphic Encryption by Bridging Modular and Bit-Level Arithmetic

Author: Chielle Eduardo
Gamil Homer
Maniatakos Michail
Mazonka Oleg
Publication venue: Association for Computing Machinery (ACM)
Publication date: 27/07/2022

The dramatic increase of data breaches in modern computing platforms has emphasized that access control is not sufficient to protect sensitive user data. Recent advances in cryptography allow end-to-end processing of encrypted data without the need for decryption using Fully Homomorphic Encryption (FHE). Such computation, however, is still orders of magnitude slower than direct (unencrypted) computation. Depending on the underlying cryptographic scheme, FHE can work natively either at bit-level using Boolean circuits, or over integers using modular arithmetic. Operations on integers are limited to addition/subtraction and multiplication. On the other hand, bit-level arithmetic is much more comprehensive, allowing more operations, such as comparison and division. While modular arithmetic can emulate bit-level computation, there is a significant cost in performance. In this work, we propose a novel method, dubbed bridging, that blends faster and restricted modular computation with slower and comprehensive bit-level computation, making them both usable within the same application and with the same cryptographic scheme instantiation. We introduce and open-source C++ types representing the two distinct arithmetic modes, offering the possibility to convert from one to the other. Experimental results show that bridging modular and bit-level arithmetic computation can lead to 1-2 orders of magnitude performance improvement for tested synthetic benchmarks, as well as one real-world FHE application: a genotype imputation case study.
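The contrast between the two arithmetic modes can be made concrete with a plaintext simulation in Python. Nothing here is encrypted and nothing is taken from the paper's code: the modulus `T`, the gate formulas, and every function name are assumptions made for illustration. The sketch builds a comparison out of gates that use only addition, subtraction, and multiplication on values restricted to 0 and 1, then converts a bit vector back to a single modular integer and finishes the computation with plain modular arithmetic.

```python
T = 65537  # toy plaintext modulus (assumption)


def xor_gate(a, b):
    # Valid only for a, b in {0, 1}; uses nothing beyond +, -, * mod T.
    return (a + b - 2 * a * b) % T


def and_gate(a, b):
    return (a * b) % T


def not_gate(a):
    return (1 - a) % T


def to_bits(value, width):
    # Little-endian bit decomposition, done on the plaintext before "encryption".
    return [(value >> i) & 1 for i in range(width)]


def less_than(a_bits, b_bits):
    """Unsigned a < b as a gate circuit, scanning from least to most significant bit."""
    lt = 0
    for a, b in zip(a_bits, b_bits):
        eq = not_gate(xor_gate(a, b))
        # If the bits differ, the answer so far is (not a) and b;
        # if they are equal, keep the answer from the lower bits.
        lt = xor_gate(and_gate(not_gate(a), b), and_gate(eq, lt))
    return lt


def bits_to_modular(bits):
    """Bit-level to modular conversion: a weighted sum, so only + and constant *."""
    return sum(bit * (1 << i) for i, bit in enumerate(bits)) % T


def select_min(a, b, width):
    """Compare at bit level, then pick the smaller value with modular arithmetic."""
    a_bits, b_bits = to_bits(a, width), to_bits(b, width)
    sel = less_than(a_bits, b_bits)               # slow, comprehensive mode
    a_mod, b_mod = bits_to_modular(a_bits), bits_to_modular(b_bits)
    return (sel * a_mod + (1 - sel) * b_mod) % T  # fast, restricted mode
```

Only `less_than` needs one gate sequence per bit; the final selection is two multiplications and an addition in the modular domain. That split, expensive bit-level work only where an operation such as comparison demands it, is the intuition this sketch is meant to convey.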

arXiv.org e-Print Archive